A Comparative Study on Decision Rule Induction for incomplete data using Rough Set and Random Tree Approaches

Authors

  • M. Sandhya
  • C. Senthamarai
Abstract

Handling missing attribute values is one of the greatest challenges in data analysis, and many approaches can be adopted to handle them. This paper presents a comparative analysis of an incomplete dataset for future prediction, using a rough set approach and random tree generation in data mining. The result of a simple classification technique (a random tree classifier) is compared with the result of rough set attribute reduction based on rule induction and a decision tree. WEKA (Waikato Environment for Knowledge Analysis), a data mining tool, and ROSE2 (Rough Set Data Explorer), a rough set tool, were used for the experiment. The results show that the random tree classification algorithm gives promising results with very high accuracy and produces the best decision rules, using a decision tree, on the original incomplete data, that is, with the missing attribute values simply ignored. In the rough set approach, by contrast, the missing attribute values are filled with the most common value of that attribute's domain. The paper concludes that simply ignoring the missing data yields better decisions than filling in values in place of the missing attribute values.

Keywords: Random Tree, WEKA, ROSE2, Missing attribute, Incomplete dataset, Classification, Rule Induction, Decision Tree.
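
The most-common-value filling mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the ROSE2 implementation: the "?" missing-value marker, the function name fill_most_common, and the toy rows are all assumptions made for illustration.

```python
from collections import Counter

MISSING = "?"  # assumed marker for a missing value; the abstract does not name one


def fill_most_common(rows, missing=MISSING):
    """Return a copy of rows in which every missing value is replaced by
    the most frequent observed value of the same attribute (column)."""
    if not rows:
        return []
    modes = []
    for col in range(len(rows[0])):
        observed = [row[col] for row in rows if row[col] != missing]
        # most_common(1) returns [(value, count)]; a column with no
        # observed values keeps the missing marker.
        modes.append(Counter(observed).most_common(1)[0][0] if observed else missing)
    return [
        [modes[col] if value == missing else value for col, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy dataset: two condition attributes and one decision attribute.
rows = [
    ["sunny", "high", "no"],
    ["sunny", "?", "no"],
    ["rainy", "high", "yes"],
    ["?", "normal", "yes"],
]
filled = fill_most_common(rows)
```

The original rows are left unchanged, so the incomplete data and the filled data can each be passed to a classifier and the results compared, which is the kind of comparison the abstract describes.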

Similar resources

Application of Rough Set Theory in Data Mining for Decision Support Systems (DSSs)

Decision support systems (DSSs) are prevalent information systems for decision making in many competitive business environments. In a DSS, decision making process is intimately related to some factors which determine the quality of information systems and their related products. Traditional approaches to data analysis usually cannot be implemented in sophisticated Companies, where managers ne...

A Comparative Study on Strategies of Rule Induction for Incomplete Data Based on Rough Set Approach

Rough set based rule induction approaches have been studied intensively during past few years. However, classical rough set model cannot deal with incomplete data sets. There are two main categories dealing with this problem: the preprocessing methods and the extensions of rough set model. This paper focuses on the comparison of three strategies for dealing with incomplete data containing three...

VC-DomLEM: Rule induction algorithm for variable consistency rough set approaches

We present a general rule induction algorithm based on sequential covering, suitable for variable consistency rough set approaches. This algorithm, called VC-DomLEM, can be used for both ordered and non-ordered data. In the case of ordered data, the rough set model employs dominance relation, and in the case of non-ordered data, it employs indiscernibility relation. VC-DomLEM generates a minima...

Cluster-Based Rough Set Construction

In many data mining applications, cluster analysis is widely used and its results are expected to be interpretable, comprehensible, and usable. Rough set theory is one of the techniques to induce decision rules and manage inconsistent and incomplete information. This paper proposes a method to construct equivalence classes during the clustering process, isolate outlier points and finally deduce...

A comparative study on rough set based class imbalance learning

This paper performs systematic comparative studies on rough set based class imbalance learning. We compare the strategies of weighting, re-sampling and filtering used in the rough set based methods for class imbalance learning. Weighting is better than re-sampling, and re-sampling is better than filtering. The weighted rough set based method achieves the best performance in class imbalance lear...

Publication date: 2013